ABSTRACT

An approach to using the computer to assemble German
tests is described. The purposes of the system would be: (1) an
expansion of the bilingual lexical memory bank to list and store
idioms of all degrees of difficulty, with frequency data and with
complete and sophisticated retrieval possibility for assembly; (2)
the creation of an item-synthesizing center similar to the analyzing
center which permits the computer to take apart the German input; (3)
the inclusion of statistical data, such as deltas, biserials,
percentages, etc.; (4) to determine actual frequency of words as they
occur in input; (5) to assemble vertical tests as well as
parallel-horizontal tests; and (6) to enable the professional
test production staff to concentrate their efforts on activities that
are not likely to be performed by a computer, such as reviewing
material, finding appropriate texts, writing questions for passages,
etc. "COMIT as an IR Language" by Victor Yngve is discussed
briefly, and a sample machine translation from "An Introduction to
Machine Translation" by Emile Delavenay is reproduced. (DB)
Present Classification System For German

There is presently a classification system developed for German at ETS which 
could eventually be used to assemble German tests with computer assistance. 

Items are classified as to: 

1. Key Word(s) 

2. Sub-Category 

3. Case, Person, Number, Gender 

4. Matrix 

5. Nearer Function 

6. Farther Function 

When all items (tested, pre-tested or committee approved) are thus classified 
and encoded, the computer could presumably: 
Pull items, thus saving valuable human labor

Assure a spread of items in the area of syntax, idioms, phonetics, etc.

Determine difficulty levels according to recorded statistics (already on item cards)

Control overlapping of test items within one test

Avoid testing more than one difficulty per item
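
As a rough illustration of how encoded items might be pulled, the sketch below stores each item with the six classification fields listed above plus a recorded difficulty. The field values, the difficulty figures, and the pull_items helper are hypothetical; the classification system's actual codes are not given here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    item_id: str
    key_words: tuple
    sub_category: str
    case_person_number_gender: str
    matrix: str
    nearer_function: str
    farther_function: str
    difficulty: float  # hypothetical statistic copied from the item card


def pull_items(pool, sub_category, max_items):
    """Pull items of one sub-category, easiest first, without key-word overlap."""
    chosen, used_keys = [], set()
    for item in sorted(pool, key=lambda i: i.difficulty):
        if item.sub_category != sub_category:
            continue
        if used_keys & set(item.key_words):
            continue  # a chosen item already tests this key word
        chosen.append(item)
        used_keys.update(item.key_words)
        if len(chosen) == max_items:
            break
    return chosen


# Hypothetical pool; all codes and difficulties are invented for illustration.
pool = [
    Item("G001", ("Direktor",), "syntax", "dative-3-sg-masc", "M1",
         "indirect object", "clause", 11.2),
    Item("G002", ("Direktor",), "syntax", "accusative-3-sg-masc", "M1",
         "direct object", "clause", 12.0),
    Item("G003", ("Katze",), "idiom", "nominative-3-sg-fem", "M2",
         "predicate", "clause", 15.1),
    Item("G004", ("Einladung",), "syntax", "accusative-3-sg-fem", "M1",
         "direct object", "clause", 9.8),
]
selected = pull_items(pool, "syntax", max_items=3)
print([item.item_id for item in selected])
```

In this sketch the second "Direktor" item is skipped because its key word is already in use, which is one simple way to control overlap within a test.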

However, all items have to be classified and encoded by staff or committee members,
which is time-consuming. Experience also shows that, despite the classification
and encoding instructions, one and the same item may get slightly different
treatment by two different people. The time that is saved later on, by computer
assisted assembly, may in actuality be equal to the time put in by staff or
committee members at the initial phase of classifying.

A Different Approach To Computer Aided Tests

The starting point would be the most recent, advanced, linguistically sound system
used for machine translation from a foreign language into English, for example:
German to English. Victor H. Yngve describes such a system in his article
entitled "COMIT as an IR Language" and calls it a "user-oriented general purpose
symbol-manipulation programming language" (see page 8).
In order to machine translate German into English, the computer has already been
programmed to analyze, classify, store and retrieve a German input. This input
could be anywhere from a title, sentence, paragraph, to a complete work. Such a
computer program possesses a complete German-English memory bank. It has
devices to handle idioms, which are the most difficult phenomena to deal with in
machine translation. Upon being fed an input in German, the computer analyzes,
classifies, etc., the different parts of speech as to their role in the German
sentence.

Since we would not be interested in translation as an end product, the portion of
the computer system devoted to the transfer of analyzed German (input) into
equivalent English (output) could be modified for our purposes (see page 9,
Figure 1, representing a hypothetical translating machine, and the sample of
machine translation from French into English on page 10).

(1) One of the purposes would be:

An expansion of the bilingual lexical memory bank to list and store
idioms of all degrees of difficulty, with frequency data and with
complete and sophisticated retrieval possibility for assembly.

The following example will illustrate the above function of the computer: 

A highly idiomatic German expression occurs: Es ist für die Katz(e)
Literally translated (1): It is for the cat
In correct English: It is of no avail
In idiomatic English: It is for the birds

The computer classifies this idiom according to noun categories:

Es ist für die Katz(e). The noun Katze has a frequency code number (2) in the
computer's lexical memory bank. Katze is a high frequency word occurring
in most first year German courses. The idiom is not a high frequency item
but it does occur in both spoken and written German and would possibly be
learned in a third or fourth year German course.



(1) Recently, in the English translation of a German novel by a well-known author,
the translator rendered the idiomatic expression ". . . und es war für die Katz"
with "and the cat got it", which amounts to total nonsense in English.

(2) Katze: Frequency, see J. Alan Pfeffer, Basic (Spoken) German Word List,
Prentice-Hall, Inc., 1964, page 27
The idiom Es ist für die Katz(e) would be stored as follows:

Katze, idiomatic usage: Es ist für die Katz(e)

Das ist für die Katz(e) - alternate form of idiom
It is for the cat - literal translation (nonsense)
It is of no avail - correct English
It is for the birds - idiomatic English

Having English equivalencies in the lexical memory bank is valuable because 
certain test items may be tested with English stems or options. 

Other idiomatic usages of the key word Katze would be stored under that noun 
category also, for example: 

Die Katze im Sack kaufen - to buy a pig in a poke 
Wie die Katze um den heißen Brei gehen - to beat about the bush
Die Katze aus dem Sack lassen - to let the cat out of the bag
Die Katze lässt das Mausen nicht - what is bred in the bone will out in the flesh

Our first purpose then would be to expand the bilingual lexical memory bank
to hold frequency data and to accommodate listings of idioms in such a way that
they can be pulled just like any lexical entry.
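
The storage scheme described above might be sketched as follows. This is only an illustration, not the design of any existing system: the class names, the numeric frequency code, and the course_year field are assumptions introduced here, and only the idiom texts come from the examples above.

```python
from dataclasses import dataclass, field


@dataclass
class Idiom:
    german: str
    literal: str
    english: str
    idiomatic_english: str = ""
    alternates: list = field(default_factory=list)
    course_year: int = 0  # hypothetical level at which the idiom is learned


@dataclass
class LexicalEntry:
    key_word: str
    frequency_code: int  # hypothetical code; the real coding is not given
    idioms: list = field(default_factory=list)


class LexicalBank:
    """Idioms are filed under their key word and pulled like any entry."""

    def __init__(self):
        self._entries = {}

    def store(self, entry):
        self._entries[entry.key_word] = entry

    def idioms_for(self, key_word, max_course_year=None):
        entry = self._entries.get(key_word)
        if entry is None:
            return []
        return [i for i in entry.idioms
                if max_course_year is None or i.course_year <= max_course_year]


bank = LexicalBank()
bank.store(LexicalEntry("Katze", frequency_code=1, idioms=[
    Idiom("Es ist für die Katz(e)", "It is for the cat", "It is of no avail",
          "It is for the birds", ["Das ist für die Katz(e)"], course_year=3),
    # course_year=4 below is an invented value for illustration only.
    Idiom("Die Katze im Sack kaufen", "To buy the cat in the sack",
          "to buy a pig in a poke", course_year=4),
]))

for idiom in bank.idioms_for("Katze", max_course_year=3):
    print(idiom.german, "-", idiom.english)
```

Keeping the English equivalents on the same record is what would allow items with English stems or options to be drawn from the same entry.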

(2) Another purpose would be: 

The creation of an item-synthesizing center similar to the analyzing
center which permits the computer to take apart the German input.

For example: the existing system is able to analyze the following test item
linguistically:

Hast du daran gedacht, . . . eine Einladung zu schicken?

(A) unserem neuen Direktor 

(B) unseren neuen Direktor 

(C) unseres neuen Direktors 

(D) unser neuer Direktor 
It is possible to imagine how "frames" for all major areas of language analysis
can be programmed so that the computer can fill them from its expanded lexical
bank to produce unlimited amounts of items according to definite specifications.

Example: 

Noun: frequency data - case - number - gender specifications - testing specification

Adjective: frequency data - case - number - gender specifications - testing specification

Verb: frequency data - tense - number - voice specifications - testing specification
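
A minimal sketch of filling such a frame is given below, using the sample item shown earlier. The frame layout, the fill_frame helper, and the idea of keying declined forms by case are assumptions made for illustration; a real synthesizing center would derive the forms from the lexical bank rather than list them by hand.

```python
import random

# Declined forms of one noun phrase, keyed by case (taken from the sample item).
NOUN_PHRASE = {
    "nominative": "unser neuer Direktor",
    "accusative": "unseren neuen Direktor",
    "dative": "unserem neuen Direktor",
    "genitive": "unseres neuen Direktors",
}

# Hypothetical frame: a stem with a gap and the case the gap requires.
FRAME = {
    "stem": "Hast du daran gedacht, . . . eine Einladung zu schicken?",
    "tested_case": "dative",
}


def fill_frame(frame, forms, seed=0):
    """Return the item text and the label of the keyed (correct) option."""
    rng = random.Random(seed)  # fixed seed so the option order is repeatable
    options = list(forms.items())
    rng.shuffle(options)
    lines = [frame["stem"]]
    key = None
    for label, (case, text) in zip("ABCD", options):
        lines.append(f"({label}) {text}")
        if case == frame["tested_case"]:
            key = label
    return "\n".join(lines), key


item_text, key = fill_frame(FRAME, NOUN_PHRASE)
print(item_text)
print("Key:", key)
```

Swapping in another noun phrase or another tested case would yield a new item from the same frame, which is the sense in which the supply of items becomes unlimited.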

(3) A third purpose of the computer would be: 

The inclusion of statistical data such as shown on the item card below:

Hast du daran gedacht, . . . eine Einladung zu schicken?

(A) unserem neuen Direktor

(B) unseren neuen Direktor

(C) unseres neuen Direktors

(D) unser neuer Direktor
Statistical data such as deltas, biserials, percentages, etc., could become
functional in the production and/or assembly of items. These data, which
importantly complement mere frequency data from the lexical bank, would
be stored and retrieved upon instruction; just how cannot be discussed here.
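
For readers unfamiliar with these statistics, the sketch below computes them for one item under the conventional definitions, which are assumed here and not spelled out in this paper: delta = 13 - 4z, where z is the normal deviate of the proportion answering correctly, and the biserial correlation between the item and the total score. The response data are invented.

```python
from statistics import NormalDist, mean, pstdev


def item_statistics(responses, total_scores):
    """responses: 1 = right, 0 = wrong; total_scores: total test scores.

    Assumes 0 < p < 1; inv_cdf raises an error at exactly 0 or 1.
    """
    n = len(responses)
    p = sum(responses) / n            # proportion answering correctly
    q = 1 - p
    z = NormalDist().inv_cdf(p)       # normal deviate for p
    delta = 13 - 4 * z                # harder items get higher deltas
    y = NormalDist().pdf(z)           # ordinate of the normal curve at z
    m_right = mean(s for r, s in zip(responses, total_scores) if r)
    m_wrong = mean(s for r, s in zip(responses, total_scores) if not r)
    biserial = (m_right - m_wrong) / pstdev(total_scores) * (p * q / y)
    return {"percent_correct": 100 * p, "delta": delta, "biserial": biserial}


# Invented data: eight candidates, one item.
responses = [1, 1, 0, 1, 0, 1, 0, 0]
total_scores = [30, 28, 25, 22, 20, 18, 15, 12]
stats = item_statistics(responses, total_scores)
print(stats)
```

Stored alongside the frequency codes, figures of this kind would let the computer select items by observed difficulty and discrimination and not by word frequency alone.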

(4) A further use of the computer would be: 

To determine actual frequency of words as they occur in input, such as
graded teaching texts (high school and college levels). These data
could be used as comparative frequency data to frequency codes in the
lexical bank of the computer (B. Q. Morgan's German Frequency Word
Book) [1]. Other frequency data could be sought by feeding the computer
input from nongraded material such as magazine articles, newspaper
clippings, literary works of all types, etc. Scanning of texts on
the basis of word frequency data (for difficulty levels) could thus be
accomplished by the computer. It would serve as a pre-screening
device for materials reviewed by test production personnel.
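
The counting and pre-screening step could be sketched as below. The helper names and the small table of frequency codes are hypothetical; a working system would compare against a full frequency list, and would need to reduce inflected forms to their key words, which this sketch does not attempt.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count each word form in the input, ignoring case and punctuation."""
    words = re.findall(r"[^\W\d_]+", text)  # runs of letters, umlauts included
    return Counter(word.lower() for word in words)


def unlisted_words(text, frequency_codes):
    """Word forms in the input that have no code in the lexical bank."""
    return sorted(w for w in word_frequencies(text) if w not in frequency_codes)


# Hypothetical frequency codes; real codes would come from a published list.
frequency_codes = {"es": 1, "ist": 1, "für": 1, "die": 1, "katze": 2}

text = "Es ist für die Katz. Die Katze lässt das Mausen nicht."
counts = word_frequencies(text)
print(counts.most_common(3))
print(unlisted_words(text, frequency_codes))
```

A text with many unlisted or low-frequency forms could be flagged as difficult before it reaches the test production staff.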

(5) A computer equipped with word and idiom frequency data and statistical 
data could also assemble: 

vertical tests as well as parallel-horizontal tests (see (2) pp. 3-5).
Vertical tests would differ from parallel or horizontal tests in that
they are not assembled to reproduce similar difficulty levels but
produce tests which start with easy items and become increasingly
difficult in the different sections.

Logically the computer could also produce parallel batteries of 
such tests. 
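
The difference between the two kinds of assembly can be sketched as follows. Both functions and the difficulty values are illustrative assumptions: the vertical test simply orders items from easy to hard and cuts them into sections, while the parallel forms are dealt out in alternating order so that each form receives a similar spread of difficulty.

```python
def assemble_vertical(items, n_sections):
    """Sections that start easy and become increasingly difficult."""
    ordered = sorted(items, key=lambda item: item[1])
    size = -(-len(ordered) // n_sections)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def assemble_parallel(items, n_forms):
    """Forms of similar difficulty, dealt out in alternating (serpentine) order."""
    ordered = sorted(items, key=lambda item: item[1])
    forms = [[] for _ in range(n_forms)]
    for start in range(0, len(ordered), n_forms):
        block = ordered[start:start + n_forms]
        if (start // n_forms) % 2:
            block.reverse()  # reverse every second block to balance the forms
        for form, item in zip(forms, block):
            form.append(item)
    return forms


# Hypothetical (item id, difficulty) pairs.
items = [("i1", 8.0), ("i2", 9.0), ("i3", 10.0), ("i4", 11.0),
         ("i5", 12.0), ("i6", 13.0), ("i7", 14.0), ("i8", 15.0)]

sections = assemble_vertical(items, n_sections=2)
form_a, form_b = assemble_parallel(items, n_forms=2)
print([item_id for item_id, _ in sections[0]])
print(sum(d for _, d in form_a), sum(d for _, d in form_b))
```

Parallel batteries of vertical tests would follow by applying the sectioning step to each parallel form in turn.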

(6) At the stage of computer development and cost right now, it seems unlikely
that a computer equipped with a symbol-manipulation programming language
could be used for so small a field as German. However, if future prospects
are as rosy as Emile Delavenay states, then "Translating machines will soon
take their place beside gramophone and colour reproductions in the first
rank of modern techniques for the spread of science and culture" [2].

[1] J. Alan Pfeffer: Basic (Spoken) German Word List, Prentice-Hall, Inc., 1964

[2] Emile Delavenay: An Introduction to Machine Translation, F. A. Praeger,
Publishers, New York, 1960






If the cost factor is reduced and ways can be found to make such a language 
serve more than one foreign language in item production and test assembly 
(instead of serving the purpose of translation), the professional staff working 
on the production of tests (staff members, Committee members, item writers 
and others) could concentrate their efforts on the more rewarding tasks of 
reviewing material, finding appropriate texts, writing questions for passages 
and similar activities which are not likely to be performed by a computer. 

(7) The German Quarterly* reports that:

"A research group led by the German scholar Hans Eggers of the
University of Saarbrücken has succeeded for the first time in
syntactically analyzing modern German on a broad basis by means
of an electronic computer. The results have been published in a
report entitled Elektronische Syntaxanalyse (Tübingen: Max Niemeyer
Verlag, 1969). Eggers and his team spent several years in developing
a computer program which enables them to take any German sentence,
identify it grammatically and syntactically classify it."

A study of the Saarbrücken research mentioned above might yield a basis
for computer aided item generating.

* Vol. XLIII, #4, November, 1970, pp. 837-38.
Appendixed Sources

Victor H. Yngve: "COMIT as an IR Language" *

Many of the features that make COMIT a good all-around symbol manipulation
language also render it well suited to various types of information retrieval
programs. Presented here is a general discussion of this unique and different
programming language and an examination of some of its applications. . . .
COMIT is available for the IBM 709 or 7090; a 704 version is partially checked
out and could easily be put into shape. The system consists of a two-pass
compiler that translates the COMIT language into a machine-oriented notation
which is then run interpretively. Compiler and interpreter total about 16,000
instructions. Although COMIT was only recently released through Share, it
has been in experimental use for some time and a number of problems have
been programmed and run with the system. A list of some of the problem
areas in which COMIT programs have been written or are being written is as
follows: mechanical translation routines, information retrieval research,
vocabulary analysis, text processing-editing, random generation of sentences,
automatic milling machine programming, sociological data reduction, simulation
of human problem solving, simulation of games, theorem-proving and
mathematical logic, logico-semantic investigations, electrical network
analysis.

COMIT promises to be especially useful for information retrieval problems
for two reasons. Two of the most central built-in features are a simple
scheme for dictionary search and a simple scheme of search using criteria
such as class inclusion and context. The dictionary search scheme offers
automatic alphabetization of the dictionary entries and a high-speed binary
search at run time. The other search scheme, using criteria such as class
inclusion and context, is a linear scan through a defined portion of the data
looking for a condition of exact match or of inclusion. This "workspace
search" can be easily used for searches based on descriptors or other more
complicated schemes using local or distant context. . . .
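
The two search schemes described in this excerpt can be illustrated with a short Python analogy. This is not COMIT notation, and the function names and sample data are hypothetical; it only shows the contrast between a binary search over alphabetized entries and a linear scan for descriptor inclusion.

```python
from bisect import bisect_left


def build_dictionary(pairs):
    """Alphabetize the entries once so that lookups can use binary search."""
    ordered = sorted(pairs)
    keys = [key for key, _ in ordered]
    values = [value for _, value in ordered]
    return keys, values


def dictionary_search(dictionary, word):
    """Binary search for an exact match; returns None when the word is absent."""
    keys, values = dictionary
    i = bisect_left(keys, word)
    if i < len(keys) and keys[i] == word:
        return values[i]
    return None


def workspace_search(workspace, descriptor):
    """Linear scan: names of all items whose descriptor set includes the term."""
    return [name for name, descriptors in workspace if descriptor in descriptors]


dictionary = build_dictionary([("Katze", "cat"), ("Brei", "porridge"), ("Sack", "sack")])
workspace = [
    ("doc1", {"syntax", "German"}),
    ("doc2", {"idiom", "German"}),
    ("doc3", {"syntax", "French"}),
]

print(dictionary_search(dictionary, "Katze"))
print(workspace_search(workspace, "idiom"))
```

For test assembly, the first scheme corresponds to looking up a key word in the lexical bank and the second to scanning classified items for a given category.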

* Communications of the ACM, Vol. 5, No. 1, January 1962, pp. 19-21.
COMIT as an IR Language (Continued)

The workspace constituent therefore is the basic entity in which data is
stored. A constituent may represent a letter or a word of text; it may
represent an algebraic symbol; or it may represent a retrieval item consisting
of a document name, a document number, and a set of descriptors. A
retrieval item is typical of a COMIT constituent and can serve as an
illustration of the internal structure of a constituent. A constituent consists of a
symbol and optional subscripts. There may be one numerical subscript and
any number of logical subscripts. . . .

COMIT is decidedly not, however, one of the programming languages that
allows one to "program in English". But it is a programming language that
takes advantage of intuitive feelings of naturalness that stem from the user's
fluency in a natural language. . . .



V. H. Yngve: "A Framework for Syntactic Translation," Mechanical Translation,
Vol. 4, No. 3, 1957, pp. 59-65.

[Figure 1, a hypothetical translating machine, from p. 63 of the article]
Sample Machine Translation

Emile Delavenay: An Introduction to Machine Translation



LA RECHERCHE A PROGRESSE1 DE
FACON SPECTACULAIRE DEPUIS
$FIG 1955 , . . DES TRADUCTIONS
UTILES SONT FAITES PAR DES
MACHINES ET LEUR NOMBRE IRA
EN CROISSANT , LEUR QUALITE1 S
AME1LIORERA CONSTAMMENT.



THE RESEARCH HAS PROGRESSED
IN A SPECTACULAR MANNER SINCE 
1955 /COLON/ USEFUL TRADUCTIONS 
ARE DONE BY MACHINES AND THEIR 
NUMBER WILL INCREASE CONTINU- 
OUSLY , THEIR QUALITY WILL IM- 
PROVE ITSELF CONSTANTLY. 



Fig. 5. A Specimen of Machine 
Translation 



(a) A Foreword to this book, as typed
out in its original French in the
course of its mechanical translation
on the I.B.M. 704 computer.
This Foreword was written for the
sole purpose of being so translated.
See page 119 for an explanation of
figures in words and other
conventional symbols.



(b) Reproduction of the actual machine
translation of the same Foreword,
as typed out by the I.B.M. 704
computer in Paris on 19th June 1959.
The French-to-English translation
programme used was conceived and
designed by Mr. A. F. R. Brown of
Georgetown University for the
translation of texts on chemistry
and nuclear energy. A fuller
explanation will be found on page 119.